ACCESSION NUMBER:        2005:140786 CAPLUS Full-text
DOCUMENT NUMBER:         142:233279
TITLE:                   Phage-display peptides as novel antimicrobial agents
                         against Haemophilus influenzae, and uses in
                         identifying bacterial receptors and genes encoding the
                         same
INVENTOR(S):             Bishop-Hurley, Sharon L.; Schmidt, Francis J.; Smith,
                         Arnold L.
PATENT ASSIGNEE(S):      The Curators of the University of Missouri, USA
SOURCE:                  U.S. Pat. Appl. Publ., 24 pp.
                         CODEN: USXXCO
DOCUMENT TYPE:           Patent
LANGUAGE:                English
FAMILY ACC. NUM. COUNT:  1
PATENT INFORMATION:

     PATENT NO.        KIND   DATE       APPLICATION NO.    DATE
     US 2005037972     A1     20050217   US 2003-655562     20030904
PRIORITY APPLN. INFO.:                   US 2002-409909P  P 20020911
ED   Entered STN: 18 Feb 2005
AB   Whole cell phage-display techniques were used to identify several peptides
     that bound preferentially to a non-typeable strain of Haemophilus influenzae.
     These peptides were able to inhibit growth of both H. influenzae and
     Staphylococcus aureus. Thus, methods for treating bacterial infections, alone
     or in combination with traditional antibiotics, are envisioned. Also provided
     is a method for identifying a bacterial receptor comprising (a) providing a
     sample suspected of comprising a bacterial receptor; (b) providing a peptide
     comprising the sequence KQRDSRSGYTAPTLV, KKSHHPSSEWGLNLT, GRHRTSVPTDEVFIT,
     KQRTSIRATEGCLPS, RNHGTDRATTIPPLS, WFLSSRNSAVFTDF, GSRGKHTFVRPTLVF, or
     FISYSSPSHMGARMR; (c) contacting the sample with the peptide; and (d)
     identifying a receptor that binds to the peptide. The sample may be a whole
     bacterium or a bacterial cell wall. The peptide may be fixed to a support,
     such as a filter, a column, a bead, a dipstick or a gel. The method may
     further comprise degradative sequencing of said identified bacterial receptor,
     designing a degenerative probe based on the sequence of said identified
     receptor, and using the degenerative probe to identify the gene encoding the
     identified receptor.
IT   845509-27-5P
     RL: ARG (Analytical reagent use); BPN (Biosynthetic preparation); DEV
     (Device component use); PAC (Pharmacological activity); PRP (Properties);
     THU (Therapeutic use); ANST (Analytical study); BIOL (Biological study);
     PREP (Preparation); USES (Uses)
        (amino acid sequence, antimicrobial peptide; phage-display peptides as
        novel antimicrobial agents against Haemophilus influenzae, and uses in
        identifying bacterial receptors)
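
The record above mentions designing a degenerative (degenerate) probe from the sequence of an identified receptor. As a rough illustration of why such probes have to be degenerate, the sketch below counts how many distinct DNA sequences could encode a given peptide under the standard genetic code. The function name `probe_degeneracy` is hypothetical, and the peptide used as input is simply one of the sequences listed in the abstract; the record itself does not describe this calculation.

```python
# Number of codons that encode each amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_degeneracy(peptide: str) -> int:
    """Count the DNA sequences that could encode `peptide` (illustrative helper)."""
    total = 1
    for residue in peptide:
        total *= CODON_COUNTS[residue]
    return total


if __name__ == "__main__":
    # One of the peptides listed in the abstract, used only as sample input.
    print(probe_degeneracy("KQRTSIRATEGCLPS"))
```

In practice a probe would be designed from a short stretch that is rich in low-degeneracy residues (such as M or W) rather than from a whole sequence.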
TITLE:                   Peptides selected for binding to a virulent strain of
                         Haemophilus influenzae by phage display are
                         bactericidal
AUTHOR(S):               Bishop-Hurley, Sharon L.; Schmidt, Francis J.; Erwin,
                         Alice L.; Smith, Arnold L.
CORPORATE SOURCE:        CSIRO Livestock Industries, Rockhampton, 4702,
                         Australia
SOURCE:                  Antimicrobial Agents and Chemotherapy (2005), 49(7),
                         2972-2978
                         CODEN: AMACCQ; ISSN: 0066-4804
PUBLISHER:               American Society for Microbiology
DOCUMENT TYPE:           Journal
LANGUAGE:                English
ED   Entered STN: 08 Jul 2005
AB   Nontypeable H. influenzae (NTHi) is an obligate parasite of the oropharynx of
     humans, in whom it commonly causes mucosal infections such as otitis media,
     sinusitis, and bronchitis. We used a subtractive phage display approach to
     affinity select for peptides binding to the cell surface of a novel invasive
     NTHi strain R2866 (also called Int1). Over half of the selected phage
     peptides tested were bactericidal toward R2866 in a dose-dependent manner.
     Five of the clones encoded the same peptide sequence (KQRTSIRATEGCLPS; clone
     hi3/17), while the remaining 4 clones encoded unique peptides. All of the
     bactericidal phage peptides but one were cationic and had similar
     phys.-chemical properties. Clone hi3/17 possessed a similar level of activity
     toward a panel of clin. NTHi isolates and H. influenzae type b strains but
     lacked bactericidal activity toward gram-pos. (Enterococcus faecalis,
     Staphylococcus aureus) and gram-neg. (Proteus mirabilis, Pseudomonas
     aeruginosa, and Salmonella enterica) bacteria. These data indicate that
     peptides binding to bacterial surface structures isolated by phage display may
     prove of value in developing new antibiotics.
IT   845509-27-5
     RL: BSU (Biological study, unclassified); BIOL (Biological study)
        (peptides binding to a virulent strain of Haemophilus influenzae by
        phage display are bactericidal)
REFERENCE COUNT:         37    THERE ARE 37 CITED REFERENCES AVAILABLE FOR THIS
                               RECORD. ALL CITATIONS AVAILABLE IN THE RE FORMAT
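
The abstract of this record notes that all but one of the bactericidal phage peptides were cationic. A crude way to see what that means for a given sequence is to count basic and acidic side chains. The sketch below is only a rule of thumb under stated assumptions: it counts K and R as +1 and D and E as -1, and ignores histidine, cysteine and the peptide termini. The name `crude_net_charge` is hypothetical and this is not the authors' method.

```python
BASIC = set("KR")   # assumed fully protonated near neutral pH
ACIDIC = set("DE")  # assumed fully deprotonated near neutral pH


def crude_net_charge(peptide: str) -> int:
    """Rough side-chain net charge; ignores His, Cys and the termini."""
    positive = sum(residue in BASIC for residue in peptide)
    negative = sum(residue in ACIDIC for residue in peptide)
    return positive - negative


if __name__ == "__main__":
    # Clone hi3/17 from the abstract, used as sample input.
    print(crude_net_charge("KQRTSIRATEGCLPS"))
```

A positive result is consistent with a cationic peptide; a proper estimate would use pKa values and the actual assay pH.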
Run on:         Search time 197 Seconds
                (without alignments)
                34.813 Million cell updates/sec

Title:          US-10-655-562A-4
Perfect score:  77
Sequence:       1 KQRTSIRATEGCLPS 15

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       2589679 seqs, 457216429 residues

Total number of hits satisfying chosen parameters:      2589679

Minimum DB seq length:
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
                geneseqp1980s:*
                geneseqp1990s:*
                geneseqp2000s:*
                geneseqp2001s:*
                geneseqp2002s:*
                geneseqp2003as:*
                geneseqp2003bs:*
                geneseqp2004s:*
                geneseqp2005s:*
                geneseqp2006s:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
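
The throughput figure in the run header can be sanity-checked from the other numbers printed there. The sketch below assumes that one "cell update" is one query-residue by database-residue comparison in the dynamic-programming matrix, so that the rate is query length times residues searched, divided by search time. That reading is an assumption that happens to reproduce the printed figures; the report does not define the term. The function name `million_cell_updates_per_sec` is illustrative.

```python
def million_cell_updates_per_sec(query_length: int,
                                 db_residues: int,
                                 seconds: float) -> float:
    """Assumed definition: (query length x database residues) / time / 1e6."""
    return query_length * db_residues / seconds / 1e6


if __name__ == "__main__":
    # 15-residue query, 457216429 residues searched in 197 seconds.
    print(round(million_cell_updates_per_sec(15, 457216429, 197), 3))
```

With the values from this run the result agrees with the reported 34.813 to three decimal places.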
RESULT 1
ADY21128
ID   ADY21128 standard; peptide; 15 AA.
XX
AC   ADY21128;
XX
DT   05-MAY-2005  (first entry)
XX
DE   Haemophilus influenza-binding phage display method peptide #14.
XX
KW   antibacterial; phage display; protein interaction;
KW   haemophilus influenzae infection; staphylococcus aureus infection.
XX
OS   Unidentified.
XX
PN   US2005037972-A1.
XX
PD   17-FEB-2005.
XX
PF   04-SEP-2003; 2003US-00655562.
XX
PR   11-SEP-2002; 2002US-0409909P.
XX
PA   (UMOR ) UNIV MISSOURI.
XX
PI   Bishop-Hurley SL,  Schmidt FJ,  Smith AL;
XX
DR   WPI; 2005-172291/18.
XX
PT   Novel isolated peptide derived from Haemophilus influenzae, useful for
PT   inhibiting growth of Staphylococcal or Haemophilus species such as
PT   Staphylococcus aureus or H. influenzae, and treating/preventing bacterial
PT   infection in subject.
XX
PS   Claim 1; Page 15; 24pp; English.
XX
CC   The invention relates to an isolated peptide (I) derived from Haemophilus
CC   influenzae, and comprising 15-50 residues of any one of 8 fully defined
CC   sequences given in specification. (I) is useful for inhibiting the growth
CC   of a Staphylococcal or Haemophilus sp. such as Staphylococcus aureus or
CC   H. influenzae. The peptide is 15-50 residues, preferably 15 residues in
CC   length. The method involves contacting the species with (I), and
CC   contacting the species with a chemopharmaceutical antibiotic. (I) is
CC   useful for treating or preventing a bacterial infection in a subject,
CC   which involves contacting the subject with (I), to inhibit the growth of
CC   bacteria in vivo. (I) is useful for preventing bacterial growth in a
CC   solution or bacterial attachment or growth on an abiotic surface, which
CC   involves mixing the solution with (I) or coating the abiotic surface with
CC   (I) to inhibit the growth of bacteria in vivo. The surface is part of a
CC   medical device. (I) is useful for identifying a bacterial receptor in a
CC   sample, which involves providing a sample suspected of comprising a
CC   bacterial receptor, contacting the sample with (I), and identifying a
CC   receptor that binds to (I). The sample is a whole bacterium or bacterial
CC   cell wall. (I) is fixed to a support such as a filter, column, bead,
CC   dipstick or gel. The method further involves degradative sequencing of
CC   the identified receptor, designing a degenerative probe based on the
CC   sequence of the identified receptor and using the degenerative probe to
CC   identify the gene encoding the identified receptor. (Note: this sequence
CC   is given as SEQ ID NO: 4 in the claims of the patent but does not
CC   correspond to the sequence given as SEQ ID NO: 4 in the Sequence Listing
CC   of the specification).
XX
SQ   Sequence 15 AA;

  Query Match             100.0%;  Score 77;  DB 9;  Length 15;
  Best Local Similarity   100.0%;  Pred. No. 2.8e-06;
  Matches   15;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy          1 KQRTSIRATEGCLPS 15
              |||||||||||||||
Db          1 KQRTSIRATEGCLPS 15
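
The "Perfect score" and "Query Match" figures in this report can be cross-checked by hand. The sketch below assumes, as the printed numbers suggest but the report does not state, that the perfect score is the score of the query aligned against itself under BLOSUM62, and that Query Match is a hit's score expressed as a percentage of that value. Only the diagonal (identity) entries of BLOSUM62 are needed. The names `perfect_score` and `query_match` are illustrative.

```python
# Diagonal (self-substitution) scores of the BLOSUM62 matrix.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}


def perfect_score(query: str) -> int:
    """Score of the query aligned to itself (assumed meaning of 'Perfect score')."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in query)


def query_match(score: float, perfect: float) -> float:
    """Hit score as a percentage of the perfect score, to one decimal place."""
    return round(100.0 * score / perfect, 1)


if __name__ == "__main__":
    best = perfect_score("KQRTSIRATEGCLPS")
    print(best, query_match(42, best))
```

For the query used here this gives 77, and a hit scoring 42 comes out at 54.5%, which matches the values printed in the report.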
ID   AAM96603 standard; protein; 109 AA.
XX
AC   AAM96603;
XX
DT   21-NOV-2001  (first entry)
XX
DE   Human reproductive system related antigen SEQ ID NO: 5261.
XX
KW   Human; reproductive system related antigen; reproductive system disorder;
KW   cancer; gene therapy.
XX
OS   Homo sapiens.
XX
PN   WO200155320-A2.
XX
PD   02-AUG-2001.
XX
PF   17-JAN-2001; 2001WO-US001339.
XX
PR   31-JAN-2000; 2000US-0179065P.
PR   04-FEB-2000; 2000US-0180628P.
PR   24-FEB-2000; 2000US-0184664P.
PR   02-MAR-2000; 2000US-0186350P.
PR   16-MAR-2000; 2000US-0189874P.
PR   17-NOV-2000; 2000US-0249214P.
PR   17-NOV-2000; 2000US-0249215P.
PR   17-NOV-2000; 2000US-0249216P.
PR   17-NOV-2000; 2000US-0249217P.
PR   17-NOV-2000; 2000US-0249218P.
PR   17-NOV-2000; 2000US-0249244P.
PR   17-NOV-2000; 2000US-0249245P.
PR   17-NOV-2000; 2000US-0249264P.
PR   17-NOV-2000; 2000US-0249265P.
PR   17-NOV-2000; 2000US-0249297P.
PR   17-NOV-2000; 2000US-0249299P.
PR   17-NOV-2000; 2000US-0249300P.
PR   01-DEC-2000; 2000US-0250160P.
PR   01-DEC-2000; 2000US-0250391P.
PR   05-DEC-2000; 2000US-0251030P.
PR   05-DEC-2000; 2000US-0251988P.
PR   05-DEC-2000; 2000US-0256719P.
PR   06-DEC-2000; 2000US-0251479P.
PR   08-DEC-2000; 2000US-0251856P.
PR   08-DEC-2000; 2000US-0251868P.
PR   08-DEC-2000; 2000US-0251869P.
PR   08-DEC-2000; 2000US-0251989P.
PR   08-DEC-2000; 2000US-0251990P.
PR   11-DEC-2000; 2000US-0254097P.
PR   05-JAN-2001; 2001US-0259678P.
XX
PA   (HUMA-) HUMAN GENOME SCI INC.
XX
PI   Rosen CA,  Barash SC,  Ruben SM;
XX
DR   WPI; 2001-465570/50.
DR   N-PSDB; AAL02573.
XX
PT   Isolated nucleic acid molecule encoding a reproductive system antigen is
PT   used in preventing, treating or ameliorating a medical condition.
XX
PS   Claim 11; SEQ ID NO 5261; 1297pp + Sequence Listing; English.
XX
CC   The present invention provides the protein and coding sequences of a
CC   number of human reproductive system related antigens. These can be used
CC   in the prevention and treatment of reproductive system disorders,
CC   including cancer. The present sequence is a protein of the invention.
XX
SQ   Sequence 109 AA;

  Query Match             54.5%;  Score 42;  DB 4;  Length 109;
  Best Local Similarity   63.6%;  Pred. No. 30;
  Matches    7;  Conservative    3;  Mismatches    1;  Indels    0;  Gaps    0;

Qy          4 TSIRATEGCLP 14
              |:| ||:||:|
Db         94 TAILATKGCIP 104
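
The statistics printed with this alignment (score, matches, conservative substitutions, mismatches, best local similarity) can be recomputed column by column. The sketch below does that for the eleven aligned columns shown above. It assumes that a "conservative" column is a non-identical pair with a positive BLOSUM62 score and that best local similarity is identities divided by alignment length; both assumptions reproduce the printed counts, but the report does not spell them out. The dictionary holds only the BLOSUM62 entries this one alignment needs, and `summarize_alignment` is a hypothetical helper name.

```python
# Excerpt of BLOSUM62: only the residue pairs that occur in this alignment.
BLOSUM62_EXCERPT = {
    ("T", "T"): 5, ("S", "A"): 1, ("I", "I"): 4, ("R", "L"): -2,
    ("A", "A"): 4, ("E", "K"): 1, ("G", "G"): 6, ("C", "C"): 9,
    ("L", "I"): 2, ("P", "P"): 7,
}


def summarize_alignment(query: str, subject: str, scores: dict) -> dict:
    """Recompute per-alignment statistics for an ungapped alignment."""
    if len(query) != len(subject):
        raise ValueError("aligned segments must have equal length")
    score = matches = conservative = mismatches = 0
    marks = []
    for q, s in zip(query, subject):
        value = scores[(q, s)]
        score += value
        if q == s:
            matches += 1
            marks.append("|")   # identity
        elif value > 0:
            conservative += 1
            marks.append(":")   # assumed: positive-scoring substitution
        else:
            mismatches += 1
            marks.append(" ")   # zero or negative score
    return {
        "score": score,
        "matches": matches,
        "conservative": conservative,
        "mismatches": mismatches,
        "match_line": "".join(marks),
        "similarity": round(100.0 * matches / len(query), 1),
    }


if __name__ == "__main__":
    print(summarize_alignment("TSIRATEGCLP", "TAILATKGCIP", BLOSUM62_EXCERPT))
```

Run on the segment above, this yields a score of 42 with 7 matches, 3 conservative substitutions and 1 mismatch, in agreement with the report.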



Search completed: June 6, 2006, 05:15:55
Job time : 200 secs

GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 
Run on:         June 6, 2006, 05:21:10 ; Search time 50 Seconds
                                         (without alignments)
                                         26.259 Million cell updates/sec

Title:          US-10-655-562A-4
Perfect score:  77
Sequence:       1 KQRTSIRATEGCLPS 15

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       650591 seqs, 87530628 residues

Total number of hits satisfying chosen parameters:      650591

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      Issued_Patents_AA:*
                /EMC_Celerra_SIDS3/ptodata/2/iaa/5_COMB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/iaa/6_COMB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/iaa/7_COMB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/iaa/H_COMB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/iaa/PCTUS_COMB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/iaa/RE_COMB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/iaa/backfiles1.pep:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES

                   %
Result           Query
   No.   Score   Match  Length  DB  ID                       Description

     1      42    54.5      55   2  US-09-621-976-7108       Sequence 7108, Ap
     2      42    54.5      88   2  US-09-621-976-5239       Sequence 5239, Ap
     3      42    54.5     120   2  US-09-621-976-5688       Sequence 5688, Ap
     4      42    54.5     450   2  US-09-252-991A-26556     Sequence 26556, A
     5      42    54.5     854   2  US-09-206-551-16         Sequence 16, Appl
     6      41    53.2      94   2  US-09-489-039A-8163      Sequence 8163, Ap
     7      39    50.6     161   2  US-09-270-767-33980      Sequence 33980, A
     8      39    50.6     161   2  US-09-270-767-49197      Sequence 49197, A
     9      39    50.6     268   2  US-09-252-991A-31279     Sequence 31279, A
    10      39    50.6     291   2  US-09-252-991A-19371     Sequence 19371, A
    11      39    50.6     363   2  US-09-205-258-553        Sequence 553, App
    12      39    50.6     363   2  US-10-004-860-553        Sequence 553, App
    13      39    50.6     624   2  US-09-252-991A-23659     Sequence 23659, A
    14      38    49.4     165   2  US-09-252-991A-17601     Sequence 17601, A
    15      37    48.1      72   2  US-08-469-260A-453       Sequence 453, App
    16      37    48.1      72   2  US-08-488-446-453        Sequence 453, App
    17      37    48.1      72   2  US-08-467-344A-453       Sequence 453, App
    18      37    48.1      72   2  US-08-424-550B-453       Sequence 453, App
    19      37    48.1      91   2  US-09-376-781-2          Sequence 2, Appli
    20      37    48.1     135   2  US-09-252-991A-22855     Sequence 22855, A
    21      37    48.1     532   2  US-09-533-427-6          Sequence 6, Appli
    22      37    48.1     532   2  US-09-717-789C-6         Sequence 6, Appli
    23      37    48.1     588   2  US-09-533-427-5          Sequence 5, Appli
    24      37    48.1     588   2  US-09-717-789C-5         Sequence 5, Appli
    25      37    48.1     724   2  US-09-533-427-4          Sequence 4, Appli
    26      37    48.1     724   2  US-09-717-789C-4         Sequence 4, Appli
    27      36    46.8      78   2  US-09-621-976-5240       Sequence 5240, Ap
    28      36    46.8     192   2  US-09-252-991A-27287     Sequence 27287, A
    29      36    46.8     354   2  US-09-134-000C-3663      Sequence 3663, Ap
    30      36    46.8     361   1  US-08-415-751-7          Sequence 7, Appli
    31      36    46.8     363   2  US-09-252-991A-30821     Sequence 30821, A
    32      36    46.8     393   2  US-09-248-796A-16799     Sequence 16799, A
    33      36    46.8     505   2  US-09-252-991A-23615     Sequence 23615, A
    34      36    46.8     605   2  US-09-252-991A-19462     Sequence 19462, A
    35      36    46.8    1278   2  US-09-462-136-2          Sequence 2, Appli
    36      36    46.8    1318   2  US-09-949-016-10152      Sequence 10152, A
    37      35    45.5      22   2  US-09-205-258-884        Sequence 884, App
    38      35    45.5      22   2  US-10-004-860-884        Sequence 884, App
    39      35    45.5      59   1  US-08-018-129-7          Sequence 7, Appli
    40      35    45.5      59   1  US-08-448-250-7          Sequence 7, Appli
    41      35    45.5      59   2  US-09-282-257-7          Sequence 7, Appli
    42      35    45.5      83   2  US-09-674-973A-344       Sequence 344, App
    43      35    45.5      84   2  US-09-674-973A-343       Sequence 343, App
    44      35    45.5      91   2  US-09-674-973A-348       Sequence 348, App
    45      35    45.5      92   2  US-09-674-973A-347       Sequence 347, App



ALIGNMENTS 



RESULT 1
US-09-621-976-7108
Sequence 7108, Application US/09621976
Patent No. 6639063
GENERAL INFORMATION:
 APPLICANT: Dumas Milne Edwards, J.B.
 APPLICANT: Jobert, S.
 APPLICANT: Giordano, J.Y.
 TITLE OF INVENTION: ESTs and Encoded Human Proteins.
 FILE REFERENCE: GENSET.054PR2
 CURRENT APPLICATION NUMBER: US/09/621,976
 CURRENT FILING DATE: 2000-07-21
 NUMBER OF SEQ ID NOS: 19335
 SOFTWARE: Patent.pm
SEQ ID NO 7108
  LENGTH: 55
  TYPE: PRT
  ORGANISM: Homo sapiens
  FEATURE:
  NAME/KEY: UNSURE
  LOCATION: 27
  OTHER INFORMATION: Xaa = Cys,Phe
  NAME/KEY: UNSURE
  LOCATION: 53
  OTHER INFORMATION: Xaa = Met,Arg
US-09-621-976-7108

  Query Match             54.5%;  Score 42;  DB 2;  Length 55;
  Best Local Similarity   63.6%;  Pred. No. 2.9;
  Matches    7;  Conservative    3;  Mismatches    1;  Indels    0;  Gaps    0;

Qy          4 TSIRATEGCLP 14
              |:| ||:||:|
Db         40 TAILATKGCIP 50
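
Each search in this report is described as using the "sw model" with a gap-open and a gap-extension parameter. As a minimal sketch of what a Smith-Waterman local alignment score with affine gaps looks like, the function below uses the Gotoh recurrences and keeps only one previous row in memory. It is not the GenCore implementation. In particular, it assumes that the first position of a gap costs `gap_open` and each further position costs `gap_extend`; the report does not say which convention the tool uses. The substitution scores are passed in as a function so that any matrix can be plugged in; the simple identity scoring in the example is purely illustrative.

```python
from typing import Callable


def smith_waterman_score(query: str,
                         subject: str,
                         score: Callable[[str, str], float],
                         gap_open: float = 10.0,
                         gap_extend: float = 0.5) -> float:
    """Best local alignment score with affine gaps (Gotoh recurrences)."""
    neg_inf = float("-inf")
    cols = len(subject) + 1
    prev_h = [0.0] * cols        # best score of an alignment ending at (i-1, j)
    prev_f = [neg_inf] * cols    # same, but ending in a vertical gap
    best = 0.0
    for i in range(1, len(query) + 1):
        cur_h = [0.0] * cols
        cur_f = [neg_inf] * cols
        e = neg_inf              # alignment ending in a horizontal gap
        for j in range(1, cols):
            # Open a new gap from H, or extend the gap already in progress.
            e = max(cur_h[j - 1] - gap_open, e - gap_extend)
            cur_f[j] = max(prev_h[j] - gap_open, prev_f[j] - gap_extend)
            diagonal = prev_h[j - 1] + score(query[i - 1], subject[j - 1])
            # Local alignment: never let the score drop below zero.
            cur_h[j] = max(0.0, diagonal, e, cur_f[j])
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best


if __name__ == "__main__":
    identity = lambda a, b: 5.0 if a == b else -4.0  # illustrative scoring only
    print(smith_waterman_score("ABCDEFGHIJ", "ABCDEXFGHIJ", identity))
```

Passing a BLOSUM62 lookup as `score` would give scores on the same scale as those in the report, subject to the gap convention noted above.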



Search completed: June 6, 2006, 05:22:34
Job time : 51 secs

GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 

Run on:         June 6, 2006, 05:33:06 ; Search time 184 Seconds
                                         (without alignments)
                                         37.762 Million cell updates/sec

Title:          US-10-655-562A-4
Perfect score:  77
Sequence:       1 KQRTSIRATEGCLPS 15

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       2097797 seqs, 463214858 residues

Total number of hits satisfying chosen parameters:      2097797

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      Published_Applications_AA_Main:*
                1: /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US07_PUBCOMB.pep:*
                2: /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US08_PUBCOMB.pep:*
                3: /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US09_PUBCOMB.pep:*
                4: /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US10A_PUBCOMB.pep:*
                5: /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US10B_PUBCOMB.pep:*
                6: /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US11_PUBCOMB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES

                   %
Result           Query
   No.   Score   Match  Length  DB  ID                       Description

     1      44    57.1      78   4  US-10-437-963-115135     Sequence 115135,
     2      43    55.8      80   4  US-10-425-115-215070     Sequence 215070,
     3      42    54.5     109   3  US-09-764-891-5261       Sequence 5261, Ap
     4      42    54.5     117   4  US-10-437-963-135764     Sequence 135764,
     5      42    54.5     208   4  US-10-425-115-276978     Sequence 276978,
     6      42    54.5     249   3  US-09-745-763-9          Sequence 9, Appli
     7      42    54.5     249   4  US-10-028-072-110        Sequence 110, App
     8      42    54.5     249   4  US-10-140-808-110        Sequence 110, App
     9      42    54.5     249   4  US-10-121-049-110        Sequence 110, App
    10      42    54.5     249   4  US-10-123-904-110        Sequence 110, App
    11      42    54.5     249   4  US-10-140-470-110        Sequence 110, App
    12      42    54.5     249   4  US-10-175-746-110        Sequence 110, App
    13      42    54.5     249   4  US-10-176-918-110        Sequence 110, App
    14      42    54.5     249   4  US-10-176-921-110        Sequence 110, App
    15      42    54.5     249   4  US-10-137-865-110        Sequence 110, App
    16      42    54.5     249   4  US-10-140-474-110        Sequence 110, App
    17      42    54.5     249   4  US-10-142-431-110        Sequence 110, App
    18      42    54.5     249   4  US-10-143-114-110        Sequence 110, App
    19      42    54.5     249   4  US-10-142-419-110        Sequence 110, App
    20      42    54.5     249   4  US-10-123-262-110        Sequence 110, App
    21      42    54.5     249   4  US-10-142-423-110        Sequence 110, App
    22      42    54.5     249   4  US-10-121-050-110        Sequence 110, App
    23      42    54.5     249   4  US-10-141-755-110        Sequence 110, App
    24      42    54.5     249   4  US-10-143-032-110        Sequence 110, App
    25      42    54.5     249   4  US-10-123-108-110        Sequence 110, App
    26      42    54.5     249   4  US-10-123-236-110        Sequence 110, App
    27      42    54.5     249   4  US-10-123-261-110        Sequence 110, App
    28      42    54.5     249   4  US-10-140-921-110        Sequence 110, App
    29      42    54.5     249   4  US-10-140-928-110        Sequence 110, App
    30      42    54.5     249   4  US-10-121-045-110        Sequence 110, App
    31      42    54.5     249   4  US-10-123-292-110        Sequence 110, App
    32      42    54.5     249   4  US-10-123-903-110        Sequence 110, App
    33      42    54.5     249   4  US-10-124-819-110        Sequence 110, App
    34      42    54.5     249   4  US-10-124-822-110        Sequence 110, App
    35      42    54.5     249   4  US-10-140-925-110        Sequence 110, App
    36      42    54.5     249   4  US-10-160-498-110        Sequence 110, App
    37      42    54.5     249   4  US-10-124-824-110        Sequence 110, App
    38      42    54.5     249   4  US-10-127-825A-110       Sequence 110, App
    39      42    54.5     249   4  US-10-127-829A-110       Sequence 110, App
    40      42    54.5     249   4  US-10-127-835A-110       Sequence 110, App
    41      42    54.5     249   4  US-10-127-839A-110       Sequence 110, App
    42      42    54.5     249   4  US-10-127-901A-110       Sequence 110, App
    43      42    54.5     249   4  US-10-128-693A-110       Sequence 110, App
    44      42    54.5     249   4  US-10-131-813A-110       Sequence 110, App
    45      42    54.5     249   4  US-10-131-818A-110       Sequence 110, App



ALIGNMENTS 



RESULT 1
US-10-437-963-115135
Sequence 115135, Application US/10437963
Publication No. US20040123343A1
GENERAL INFORMATION:
 APPLICANT: La Rosa, Thomas J.
 APPLICANT: Kovalic, David K.
 APPLICANT: Zhou, Yihua
 APPLICANT: Cao, Yongwei
 APPLICANT: Wu, Wei
 APPLICANT: Boukharov, Andrey A.
 APPLICANT: Barbazuk, Brad
 APPLICANT: Li, Ping
 TITLE OF INVENTION: Rice Nucleic Acid Molecules and Other Molecules Associated With
 TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
 FILE REFERENCE: 38-21(53221)B
 CURRENT APPLICATION NUMBER: US/10/437,963
 CURRENT FILING DATE: 2003-05-14
 NUMBER OF SEQ ID NOS: 204966
SEQ ID NO 115135
  LENGTH: 78
  TYPE: PRT
  ORGANISM: Oryza sativa
  FEATURE:
  OTHER INFORMATION: Clone ID: PAT_MRT4530_18760C.1.pep
US-10-437-963-115135

  Query Match             57.1%;  Score 44;  DB 4;  Length 78;
  Best Local Similarity   57.1%;  Pred. No. 10;
  Matches    8;  Conservative    2;  Mismatches    4;  Indels    0;  Gaps    0;

Qy          1 KQRTSIRATEGCLP 14
              |||:| :  ||| |
Db         53 KQRSSRKGGEGCFP 66
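
The result IDs in the summary tables appear to encode both the application number and the sequence number, as the record above shows: ID US-10-437-963-115135 corresponds to "Sequence 115135, Application US/10437963". The sketch below splits an ID of that shape into its parts. The pattern is inferred only from the IDs visible in this report, so it is an assumption rather than a documented format; the meaning of the optional letter after the application number (as in US-09-252-991A-26556) is not stated in the report, so it is returned as an uninterpreted `suffix`. The name `parse_result_id` is hypothetical.

```python
import re

# Shape inferred from the IDs in this report: US-SS-NNN-NNN[letter]-SEQ
RESULT_ID = re.compile(r"^US-(\d{2})-(\d{3})-(\d{3})([A-Z]?)-(\d+)$")


def parse_result_id(result_id: str) -> dict:
    """Split a result ID into application number, suffix and SEQ ID NO."""
    match = RESULT_ID.match(result_id.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised result ID: {result_id!r}")
    series, block1, block2, suffix, seq_no = match.groups()
    return {
        "application": f"US/{series}{block1}{block2}",
        "suffix": suffix,          # meaning not given in the report
        "seq_id_no": int(seq_no),
    }


if __name__ == "__main__":
    print(parse_result_id("US-10-437-963-115135"))
```

IDs from other databases in this report (for example Geneseq or PIR accessions) do not follow this shape and would raise `ValueError`.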



Search completed: June 6, 2006, 05:36:25
Job time : 185 secs

GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 



Run on:         June 6, 2006, 05:33:25 ; Search time 15 Seconds
                                         (without alignments)
                                         11.565 Million cell updates/sec

Title:          US-10-655-562A-4
Perfect score:  77
Sequence:       1 KQRTSIRATEGCLPS 15

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       58871 seqs, 11565156 residues

Total number of hits satisfying chosen parameters:      58871

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      Published_Applications_AA_New:*
                /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US09_NEW_PUB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US06_NEW_PUB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US07_NEW_PUB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US08_NEW_PUB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/pubpaa/PCT_NEW_PUB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US10_NEW_PUB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US11_NEW_PUB.pep:*
                /EMC_Celerra_SIDS3/ptodata/2/pubpaa/US60_NEW_PUB.pep:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES

                   %
Result           Query
   No.   Score   Match  Length  DB  ID                       Description

     1      39    50.6     221   7  US-11-293-697-2989       Sequence 2989, Ap
     2      37    48.1     324   6  US-10-953-349-22573      Sequence 22573, A
     3      37    48.1     439   6  US-10-953-349-15731      Sequence 15731, A
     4      37    48.1     452   6  US-10-953-349-15730      Sequence 15730, A
     5      36    46.8     151   7  US-11-293-697-3091       Sequence 3091, Ap
     6      36    46.8     234   6  US-10-953-349-25569      Sequence 25569, A
     7      36    46.8     432   7  US-11-293-697-3406       Sequence 3406, Ap
     8      35    45.5     364   6  US-10-953-349-10990      Sequence 10990, A
     9      35    45.5     529   7  US-11-312-958-26         Sequence 26, Appl
    10      35    45.5     945   7  US-11-293-697-2855       Sequence 2855, Ap
    11      35    45.5     945   7  US-11-293-697-3079       Sequence 3079, Ap
    12      34    44.2     236   7  US-11-293-697-4829       Sequence 4829, Ap
    13      34    44.2     338   6  US-10-953-349-2039       Sequence 2039, Ap
    14      34    44.2     340   6  US-10-953-349-2038       Sequence 2038, Ap
    15      34    44.2     769   6  US-10-511-937-3015       Sequence 3015, Ap
    16      34    44.2     813   7  US-11-293-697-3901       Sequence 3901, Ap
    17    33.5    43.5     247   6  US-10-953-349-16451      Sequence 16451, A
    18    33.5    43.5     258   6  US-10-953-349-16450      Sequence 16450, A
    19    33.5    43.5     319   6  US-10-953-349-16449      Sequence 16449, A
    20      33    42.9     157   6  US-10-953-349-27909      Sequence 27909, A
    21      33    42.9     233   6  US-10-953-349-16422      Sequence 16422, A
    22      33    42.9     295   7  US-11-242-111-24         Sequence 24, Appl
    23      33    42.9     432   6  US-10-196-749-74         Sequence 74, Appl
    24      33    42.9     562   6  US-10-953-349-20235      Sequence 20235, A
    25      33    42.9     568   6  US-10-953-349-20234      Sequence 20234, A
    26      33    42.9     599   6  US-10-953-349-20233      Sequence 20233, A
    27      32    41.6     134   6  US-10-953-349-38480      Sequence 38480, A
    28      32    41.6     188   6  US-10-953-349-36581      Sequence 36581, A
    29      32    41.6     242   6  US-10-953-349-28679      Sequence 28679, A
    30      32    41.6     288   6  US-10-953-349-28678      Sequence 28678, A
    31      32    41.6     321   7  US-11-140-450-66         Sequence 66, Appl
    32      32    41.6     331   6  US-10-953-349-28677      Sequence 28677, A
    33      32    41.6     369   6  US-10-953-349-7986       Sequence 7986, Ap
    34      32    41.6     477   6  US-10-505-928-515        Sequence 515, App
    35      32    41.6     516   6  US-10-953-349-7985       Sequence 7985, Ap
    36      32    41.6     547   6  US-10-953-349-7984       Sequence 7984, Ap
    37      32    41.6     574   7  US-11-293-697-2802       Sequence 2802, Ap
    38      32    41.6    1043   7  US-11-293-697-3097       Sequence 3097, Ap
    39      32    41.6    1336   7  US-11-106-014-92         Sequence 92, Appl
    40    31.5    40.9     196   7  US-11-293-697-3347       Sequence 3347, Ap
    41      31    40.3     102   7  US-11-293-697-4024       Sequence 4024, Ap
    42      31    40.3     115   6  US-10-953-349-10173      Sequence 10173, A
    43      31    40.3     115   7  US-11-293-697-3614       Sequence 3614, Ap
    44      31    40.3     117   7  US-11-293-697-2985       Sequence 2985, Ap
    45      31    40.3     123   6  US-10-953-349-40076      Sequence 40076, A



ALIGNMENTS 



RESULT 1
US-11-293-697-2989
; Sequence 2989, Application US/11293697
; Publication No. US20060105376A1
; GENERAL INFORMATION:
;  APPLICANT: HELIX RESEARCH INSTITUTE
;  TITLE OF INVENTION: Novel full length cDNA
;  FILE REFERENCE: HI -AO 106
;  CURRENT APPLICATION NUMBER: US/11/293,697
;  CURRENT FILING DATE: 2005-12-05
;  PRIOR APPLICATION NUMBER: US/10/108,260
;  PRIOR FILING DATE: 2002-03-28
;  NUMBER OF SEQ ID NOS: 5458
;  SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2989
;   LENGTH: 221
;   TYPE: PRT
;   ORGANISM: Homo sapiens
US-11-293-697-2989

  Query Match             50.6%;  Score 39;  DB 7;  Length 221;
  Best Local Similarity   57.1%;  Pred. No. 3;
  Matches    8;  Conservative    2;  Mismatches    4;  Indels    0;  Gaps    0;

Qy          2 QRTSIRATEGCLPS 15
              :|| : ||  ||||
Db        175 KRTPLCATAPCLPS 188



Search completed: June 6, 2006, 05:36:46
Job time : 15 secs

GenCore version 5.1.9
Copyright (c) 1993 - 2006 Biocceleration Ltd.



OM protein - protein search, using sw model 
Run on:         June 6, 2006, 05:16:10 ; Search time 38 Seconds
                                         (without alignments)
                                         37.980 Million cell updates/sec

Title:          US-10-655-562A-4
Perfect score:  77
Sequence:       1 KQRTSIRATEGCLPS 15

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       283416 seqs, 96216763 residues

Total number of hits satisfying chosen parameters:      283416

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      PIR_80:*
                1: pir1:*
                2: pir2:*
                3: pir3:*
                4: pir4:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
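
The header values can be related to the summary columns with a short calculation. The sketch below is illustrative and not part of the GenCore output. It assumes that "Perfect score" is the query aligned against itself under BLOSUM62 (only the diagonal of the matrix is needed), and that "% Query Match" is the hit score as a percentage of that perfect score. Both assumptions agree with the figures printed in this report; the function names are hypothetical.

```python
# Diagonal (self-match) scores of the standard BLOSUM62 matrix.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}


def perfect_score(sequence: str) -> int:
    """Score of a sequence aligned against itself (assumed meaning of 'Perfect score')."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in sequence)


def query_match_pct(score: float, perfect: int) -> float:
    """Hit score as a percentage of the perfect score (assumed meaning of '% Query Match')."""
    return round(100 * score / perfect, 1)


query = "KQRTSIRATEGCLPS"
perfect = perfect_score(query)
print(perfect)                       # self-score of the 15-residue query
print(query_match_pct(45, perfect))  # top hit in the summaries has score 45
```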

SUMMARIES 

                 %
Result         Query
   No.  Score  Match  Length  DB  ID      Description

     1     45   58.4     244   2  S48492  DCG1 protein - yea
     2     42   54.5     854   1  VCLJSI  env polyprotein pr
     3   40.5   52.6     312   2  C71136  hypothetical prote
     4     40   51.9     290   2  F81700  DNA polymerase III
     5     39   50.6     160   2  S56204  probable membrane
     6     39   50.6     331   2  G69200  conserved hypothet
     7     38   49.4     191   2  T49232  hypothetical prote
     8     38   49.4     968   2  C82452  hypothetical prote
     9     38   49.4    1220   2  T32916  hypothetical prote
    10     37   48.1     188   2  T20235  hypothetical prote
    11     37   48.1     340   2  T25919  hypothetical prote
    12     37   48.1     359   2  T21247  hypothetical prote
    13     37   48.1     487   2  F70765  hypothetical prote
    14     37   48.1     642   2  D64491  hypothetical prote
    15     37   48.1    1071   2  D86279  hypothetical prote
    16     36   46.8     113   2  AH2677  hypothetical prote
    17     36   46.8     169   2  C87610  conserved hypothet
    18     36   46.8     206   2  T16153  hypothetical prote
    19     36   46.8     275   2  B97323  multidrug-efflux t
    20     36   46.8     308   2  T24912  hypothetical prote
    21     36   46.8     325   2  F83503  hypothetical prote
    22     36   46.8     346   2  F70666  probable alcohol d
    23     36   46.8     482   2  H86447  hypothetical prote
    24     36   46.8     665   2  H87468  ubiquinol oxidase
    25     36   46.8     993   2  C55226  cylM protein - Ent
    26     36   46.8    2351   2  G71415  hypothetical prote
    27     36   46.8    2567   2  A49551  filamin, Muller ce
    28     36   46.8    4006   2  T09070  probable tenascin
    29   35.5   46.1     644   2  T24366  hypothetical prote
    30   35.5   46.1     679   2  T24365  hypothetical prote
    31     35   45.5     147   2  A75196  hypothetical prote
    32     35   45.5     189   2  T19559  hypothetical prote
    33     35   45.5     273   2  E95095  hypothetical prote
    34     35   45.5     287   2  C75635  phosphoenolpyruvat
    35     35   45.5     335   2  S25212  prsG protein - Esc
    36     35   45.5     335   2  S25229  G-minor fimbrial p
    37     35   45.5     375   2  H97560  alcohol dehydrogen
    38     35   45.5     375   2  AH2781  alcohol dehydrogen
    39     35   45.5     384   2  S25771  gas1 protein - mou
    40     35   45.5     409   2  E91246  probable L-sorbose
    41     35   45.5     413   2  B86094  probable L-sorbose
    42     35   45.5     524   2  C81367  phosphoenolpyruvat
    43     35   45.5     529   2  S12787  potassium channel
    44     35   45.5     530   2  JH0167  potassium channel
    45     35   45.5     555   1  RGASWA  regulatory protein



ALIGNMENTS 



RESULT 1 
S48492 

DCG1 protein - yeast (Saccharomyces cerevisiae)
N;Alternate names: protein YIR030C
C;Species: Saccharomyces cerevisiae
C;Date: 02-Dec-1994 #sequence_revision 02-Dec-1994 #text_change 09-Jul-2004
C;Accession: S48492; S19038
R;Rowley, K.
submitted to the EMBL Data Library, October 1994
A;Reference number: S48478
A;Accession: S48492
A;Molecule type: DNA
A;Residues: 1-244 <ROW>
A;Cross-references: UNIPROT:P32460; UNIPARC:UPI0000128FAE; GB:Z47047;
EMBL:Z38061; NID:g603997; PID:g763375; MIPS:YIR030c
R;Yoo, H.S.; Cooper, T.G.
Gene 104, 55-62, 1991
A;Title: Sequences of two adjacent genes, one (DAL2) encoding allantoicase and
another (DCG1) sensitive to nitrogen-catabolite repression in Saccharomyces
cerevisiae.
A;Reference number: JH0442; MUID:92009196; PMID:1916277
A;Accession: S19038
A;Molecule type: DNA
A;Residues: 1-126,'C',128-244 <YOO>
A;Cross-references: UNIPARC:UPI000017923A; GB:M64719
C;Genetics:
A;Gene: SGD:DCG1
A;Cross-references: SGD:S0001469; MIPS:YIR030c
A;Map position: 9R
C;Superfamily: Saccharomyces cerevisiae DCG1 protein
C;Keywords: transmembrane protein
F;221-237/Domain: transmembrane #status predicted <TMM>

Query Match 58.4%; Score 45; DB 2; Length 244; 

Best Local Similarity 61.5%; Pred. No. 1.6; 

Matches 8; Conservative 2; Mismatches 3; Indels 0; Gaps 0; 

Qy          2 QRTSIRATEGCLP 14
              | |||:: | |||
Db         51 QETSIKSMEACLP 63



Search completed: June 6, 2006, 05:21:39 
Job time : 41 secs

GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 



Run on:         June 6, 2006, 05:12:45 ; Search time 294 Seconds
                (without alignments)
                47.195 Million cell updates/sec

Title:          US-10-655-562A-4
Perfect score:  77
Sequence:       1 KQRTSIRATEGCLPS 15

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       2849598 seqs, 925015592 residues

Total number of hits satisfying chosen parameters:      2849598

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      UniProt_7.2:*
                1:  uniprot_sprot:*
                2:  uniprot_trembl:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
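
The "Million cell updates/sec" figure in the header can be reproduced from the other header values. The sketch below is illustrative and not part of the GenCore output. It assumes that one cell update is one comparison of a query residue with a database residue, so the total work is the query length times the number of database residues. That assumption reproduces the 47.195 printed for this search and the 37.980 printed for the PIR search. The function name is hypothetical.

```python
def million_cell_updates_per_sec(query_length: int, db_residues: int, seconds: float) -> float:
    """Throughput of a dynamic-programming search, in millions of matrix cells per second.

    Assumption: total cells = query length x database residues.
    """
    return query_length * db_residues / seconds / 1e6


# UniProt_7.2 search: 15-residue query, 925015592 residues, 294 seconds.
print(round(million_cell_updates_per_sec(15, 925015592, 294), 3))

# PIR_80 search: 15-residue query, 96216763 residues, 38 seconds.
print(round(million_cell_updates_per_sec(15, 96216763, 38), 3))
```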

SUMMARIES 

                 %
Result         Query
   No.  Score  Match  Length  DB  ID            Description

     1     48   62.3     561   2  Q8SV66_ENCCU  Q8sv66 encephalito
     2     46   59.7     105   2  Q4YRP8_PLABE  Q4yrp8 Plasmodium
     3     45   58.4      86   2  Q3J3C1_RHOS4  Q3j3c1 rhodobacter
     4     45   58.4     244   1  DCG1_YEAST    P32460 saccharomyc
     5     44   57.1     103   2  Q2QQU0_ORYSA  Q2qqu0 oryza sativ
     6     44   57.1     103   2  Q33AN9_ORYSA  Q33an9 oryza sativ
     7   42.5   55.2    1184   2  Q57ZH0_9TRYP  Q57zh0 trypanosoma
     8     42   54.5     117   2  Q84SP6_ORYSA  Q84sp6 oryza sativ
     9     42   54.5     249   2  Q9BPY7_HUMAN  Q9bpy7 homo sapien
    10     42   54.5     267   2  Q7L5R2_HUMAN  Q7l5r2 homo sapien
    11     42   54.5     267   2  Q9BY14_HUMAN  Q9by14 homo sapien
    12     42   54.5     292   2  Q4C9D8_CROWT  Q4c9d8 crocosphaer
    13     42   54.5     854   1  ENV_SIVCZ     P17281 chimpanzee
    14     42   54.5     987   1  SYV_RHOS4     Q3j4z5 rhodobacter
    15     41   53.2      58   2  Q5N9P0_ORYSA  Q5n9p0 oryza sativ
    16     41   53.2     238   2  Q3RPZ7_RALME  Q3rpz7 ralstonia m
    17     41   53.2     292   2  Q65WI6_MANSM  Q65wi6 mannheimia
    18     41   53.2     307   2  Q8KZT6_PSEPU  Q8kzt6 pseudomonas
    19     41   53.2     357   2  Q9FFV4_ARATH  Q9ffv4 arabidopsis
    20     41   53.2     426   2  Q6NKY2_ARATH  Q6nky2 arabidopsis
    21     41   53.2     485   2  Q9SI78_ARATH  Q9si78 arabidopsis
    22     41   53.2     494   2  Q2U0Z1_ASPOR  Q2u0z1 aspergillus
    23     41   53.2     608   2  Q519L2_ENTHI  Q519l2 entamoeba h
    24   40.5   52.6     310   2  Q8U307_PYRFU  Q8u307 pyrococcus
    25   40.5   52.6     312   2  O58585_PYRHO  O58585 pyrococcus
    26     40   51.9     181   2  Q851P7_ORYSA  Q851p7 oryza sativ
    27     40   51.9     213   2  Q5JMM9_ORYSA  Q5jmm9 oryza sativ
    28     40   51.9     253   2  Q3H316_9ACTO  Q3h316 nocardioide
    29     40   51.9     290   2  Q9PKK6_CHLMU  Q9pkk6 chlamydia m
    30     40   51.9     322   2  Q55ZJ6_CRYNE  Q55zj6 cryptococcu
    31     40   51.9     322   2  Q5KNW0_CRYNE  Q5knw0 cryptococcu
    32     40   51.9     358   2  Q5SV06_MOUSE  Q5sv06 mus musculu
    33     40   51.9     368   2  Q550W6_DICDI  Q550w6 dictyosteli
    34     40   51.9     368   2  Q86KQ3_DICDI  Q86kq3 dictyosteli
    35     40   51.9     539   2  Q2VJ46_9VIRU  Q2vj46 rat adeno-a
    36     40   51.9     605   2  Q2VJ47_9VIRU  Q2vj47 rat adeno-a
    37     40   51.9     734   2  Q2VJ48_9VIRU  Q2vj48 rat adeno-a
    38     40   51.9     759   2  Q4WL45_ASPFU  Q4wl45 aspergillus
    39     40   51.9     775   2  Q3JKW4_BURP1  Q3jkw4 burkholderi
    40     40   51.9     935   2  Q3EET2_ACTSC  Q3eet2 actinobacil
    41     40   51.9    1369   2  Q4Q8A1_LEIMA  Q4q8a1 leishmania
    42     40   51.9   23015   2  Q8IQ18_DROME  Q8iq18 drosophila
    43     39   50.6     121   2  Q6ZN48_HUMAN  Q6zn48 homo sapien
    44     39   50.6     132   2  Q4C816_CROWT  Q4c816 crocosphaer
    45     39   50.6     160   1  YFF1_YEAST    P43552 saccharomyc



ALIGNMENTS 



RESULT 1 
Q8SV66_ENCCU 

ID   Q8SV66_ENCCU   PRELIMINARY;   PRT;   561 AA.
AC   Q8SV66;
DT   01-JUN-2002, integrated into UniProtKB/TrEMBL.
DT   01-JUN-2002, sequence version 1.
DT   07-FEB-2006, entry version 13.
DE   Hypothetical protein ECU06_1590.
GN   OrderedLocusNames=ECU06_1590;
OS   Encephalitozoon cuniculi.
OC   Eukaryota; Fungi; Microsporidia; Unikaryonidae; Encephalitozoon.
OX   NCBI_TaxID=6035;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC   STRAIN=GB-M1;
RX   MEDLINE=21576510; PubMed=11719806; DOI=10.1038/35106579;
RA   Katinka M.D., Duprat S., Cornillot E., Metenier G., Thomarat F.,
RA   Prensier G., Barbe V., Peyretaillade E., Brottier P., Wincker P.,
RA   Delbac F., El Alaoui H., Peyret P., Saurin W., Gouy M.,
RA   Weissenbach J., Vivares C.P.;
RT   "Genome sequence and gene compaction of the eukaryote parasite
RT   Encephalitozoon cuniculi.";
RL   Nature 414:450-453(2001).
CC
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC
DR   EMBL; AL590446; CAD25520.1; -; Genomic_DNA.
DR   InterPro; IPR002885; PPR.
DR   TIGRFAMS; TIGR00756; PPR; 1.
KW   Complete proteome; Hypothetical protein.
SQ   SEQUENCE   561 AA;  64550 MW;  238E55A1C1C09184 CRC64;



Query Match 62.3%; Score 48; DB 2; Length 561; 

Best Local Similarity 53.3%; Pred. No. 5.6; 

Matches 8; Conservative 4; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 KQRTSIRATEGCLPS 15
              |:|  ::| |||||:
Db        155 KRREMLKAMEGCLPN 169
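
The score and the Matches / Conservative / Mismatches counts for this alignment can be rechecked from the two aligned rows. The sketch below is illustrative and not part of the GenCore output. It uses an excerpt of the standard BLOSUM62 matrix holding only the residue pairs that occur in this alignment, and it assumes that "Conservative" means a non-identical pair with a positive substitution score. That assumption reproduces the counts printed above but is not stated in the report. The names `BLOSUM62_EXCERPT`, `pair_score`, and `score_ungapped` are hypothetical.

```python
# Excerpt of BLOSUM62: only the pairs present in this one alignment.
BLOSUM62_EXCERPT = {
    ("K", "K"): 5, ("Q", "R"): 1, ("R", "R"): 5, ("T", "E"): -1,
    ("S", "M"): -1, ("I", "L"): 2, ("R", "K"): 2, ("A", "A"): 4,
    ("T", "M"): -1, ("E", "E"): 5, ("G", "G"): 6, ("C", "C"): 9,
    ("L", "L"): 4, ("P", "P"): 7, ("S", "N"): 1,
}


def pair_score(a: str, b: str) -> int:
    """Look up a substitution score; the matrix is symmetric."""
    if (a, b) in BLOSUM62_EXCERPT:
        return BLOSUM62_EXCERPT[(a, b)]
    return BLOSUM62_EXCERPT[(b, a)]


def score_ungapped(query: str, subject: str) -> dict:
    """Score a gap-free alignment and classify each column."""
    result = {"score": 0, "matches": 0, "conservative": 0, "mismatches": 0}
    for q, s in zip(query, subject):
        value = pair_score(q, s)
        result["score"] += value
        if q == s:
            result["matches"] += 1
        elif value > 0:  # assumed definition of a conservative substitution
            result["conservative"] += 1
        else:
            result["mismatches"] += 1
    return result


print(score_ungapped("KQRTSIRATEGCLPS", "KRREMLKAMEGCLPN"))
```

Because the alignment has no indels, the gap penalties (Gapop 10.0, Gapext 0.5) do not enter the total.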



Search completed: June 6, 2006, 05:20:54 
Job time : 297 secs



